Chhattisgarhi Raw Speech Corpus

Owner: Central Institute of Indian Languages

Catalogue Number: 1436

Tags: Chhattisgarhi Speech Corpus

Dataset Description

LDC-IL is extending its work to the mother tongues spoken in India, as part of a broader effort to support and promote the country's linguistic diversity. Collecting Chhattisgarhi speech data is a major part of this work. Developing language technology for Indian mother tongues is intended to contribute to their overall enrichment and empowerment.

The Chhattisgarhi Raw Speech Corpus consists of recordings of native Chhattisgarhi speakers from different parts of the state of Chhattisgarh. It represents a wide range of Chhattisgarhi varieties as spoken in different locations by diverse speakers. Speakers from various age groups read prompted extracts from literary and news texts. Spontaneous speech has also been collected.

A detailed explanation of the Chhattisgarhi Raw Speech Corpus will be available in the Chhattisgarhi Raw Speech Data Documentation.

For research citations, please use the following:

1. Satyaendra Kumar Awasthi, Ankita Tiwari, Narayan Kumar Choudhary. 2023. Chhattisgarhi Raw Speech Corpus. Central Institute of Indian Languages, Mysore.

2. Choudhary, Narayan, Rajesha N., Manasa G. & L. Ramamoorthy. 2019. “LDC-IL Raw Speech Corpora: An Overview” in Linguistic Resources for AI/NLP in Indian Languages. Central Institute of Indian Languages, Mysore. pp. 160-174.

3. Choudhary, N. 2021. LDC-IL: The Indian Repository of Resources for Language Technology. Language Resources & Evaluation. Springer, Vol. 55, Issue 1. doi: https://doi.org/10.1007/s10579-020-09523-3

Item specifics

Authors: Satyaendra Kumar Awasthi, Ankita Tiwari, Shantanu Kumar, Rupesh Pandey, Saurabh Varik, Rajesha N., Manasa G., Srikanth D., Nithin S., Narayan Kumar Choudhary, Shailendra Mohan
Corpus Type: Raw Speech Corpus
Catalogue Number: 1436
ISBN: 978-81-19411-78-8
Data Source: On Field
Duration: 138:09:27
Number of Audio Segments: 359
Release Date: 8-Jan-24
Terms and Conditions: General instructions for use of the resources provided by LDC-IL.
